The Complexity of the Simultaneous Cluster Problem

Authors

  • Zhentao Li
  • Manikandan Narayanan
  • Adrian Vetta
Abstract

We study clustering over multiple graphs each encoding a distinct set of similarity relationships (edges) over the same set of objects (nodes) where the aim is to identify clusters that are supported across the collection of graphs. This problem of simultaneous clustering is readily motivated by the recent deluge of datasets in several domains (including the biological sciences, social sciences, and marketing), where the same objects are repeatedly measured in different conditions, populations or time points. Whilst there has been a vast amount of heuristic work on practical simultaneous clustering problems, little is known on the theoretical side – we present theoretical results that help explain why such heuristics typically come without quantitative guarantees. We give algorithmic and complexity results for simultaneous clustering using two standard measures on clustering quality: density and connectivity. Specifically, we focus on the basic problem of finding a single cluster (rather than an entire clustering) that is simultaneously of high quality in every graph. When the quality of a cluster is its minimum density over all graphs, we show the problem is not approximable within a factor of 2 1−ε , unless NP ⊆ DTIME(n). Furthermore, this problem appears very difficult even when there are just two graphs; the resulting problem is approximately as hard as the problem of finding a dense subgraph on at most k vertices. When cluster quality is a fixed connectivity requirement between terminals within the cluster, there are two natural optimization problems: a maximization version (find a good quality cluster with as many terminals as possible) and a minimization version (find a good quality cluster that is as small as possible). We show that the maximization problem is tractable in polynomial time for any fixed connectivity requirement k. On the other hand the minimization problem is hard to approximate within a factor of 2 1−ε , unless NP ⊆ DTIME(n). 
The number of graphs in our reduction depends on n. If instead the number of graphs is fixed, we show there is an ε > 0 for which the minimization problem is not approximable within g^(1/2−ε) for any fixed number g of graphs unless NP = ZPP. These hardness results for the minimization problem hold even in the simple cases where the connectivity requirement is one and there are either just two terminal nodes or every node is a terminal node. We remark that our results extend to the case where more robust variants of the quality measure are used.
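
To make the density objective concrete, the following is a minimal illustrative sketch and not the authors' method. It assumes that the density of a cluster in one graph is the number of induced edges divided by the number of cluster nodes, and that graphs are given as lists of undirected edges without duplicates. The abstract does not fix either convention, and the function names are hypothetical. The exhaustive search is exponential in the number of nodes and only shows what is being optimized on a toy instance.

```python
from itertools import combinations


def induced_density(edges, cluster):
    """Density of `cluster` in one graph (assumed: induced edges / |cluster|)."""
    cluster = set(cluster)
    if not cluster:
        return 0.0
    induced = sum(1 for u, v in edges if u in cluster and v in cluster)
    return induced / len(cluster)


def simultaneous_density(graphs, cluster):
    """Quality of `cluster`: its minimum density over all graphs."""
    return min(induced_density(edges, cluster) for edges in graphs)


def best_cluster_bruteforce(nodes, graphs):
    """Try every non-empty node subset; feasible only for tiny inputs."""
    best, best_quality = None, -1.0
    for size in range(1, len(nodes) + 1):
        for candidate in combinations(nodes, size):
            quality = simultaneous_density(graphs, candidate)
            if quality > best_quality:
                best, best_quality = set(candidate), quality
    return best, best_quality


# Two graphs over the same four objects (hypothetical toy data).
nodes = [0, 1, 2, 3]
graph_a = [(0, 1), (1, 2), (0, 2), (2, 3)]
graph_b = [(0, 1), (1, 2), (0, 2)]
graphs = [graph_a, graph_b]

cluster, quality = best_cluster_bruteforce(nodes, graphs)
```

In this toy instance the triangle on nodes 0, 1 and 2 is present in both graphs, so it scores well under the minimum, whereas the pair {2, 3} is an edge only in the first graph and therefore has quality zero.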

Journal:
  • J. Graph Algorithms Appl.

Volume 18  Issue 

Pages  -

Publication date 2014